This document contains the results and analysis of my second experiment for my first Qualifying Paper towards the PhD in Linguistics at Stanford University.¹
The experiment analyzed herein was an online maze task study, wherein participants read a sentence such as “David is a congressperson from Virginia. He likes cycling.” For each of the 20 critical items, participants were randomly assigned to one of four conditions, the critical regions of which are enumerated and exemplified here:
Participants proceeded through the sentences by reading one pair of grammatical and distractor items at a time and pressing the keyboard key which corresponded to the grammatical continuation of the sentence. They were then asked attention check questions at the end of each sentence; attention check questions never explicitly invoked gender. Rather, they asked about the characters’ home states. For a hands-on look at the experiment, you can click here to go to the same webpage participants were directed to for the task.
We originally ran the experiment on 200 participants, recruited through the online participant recruitment platform Prolific. The mean time of the experiment was 5.39 minutes, and participants were paid $1.75 for their participation.² The only restrictions placed on participants were that they:
These requirements were implemented in order to ensure that speakers came from at least somewhat similar linguistic backgrounds, as certain lexical items in the study (such as congressperson) are quite localized to the United States.
After this initial run of the experiment, we found that there was a dearth of conservative or Republican-aligned participants. As a result, we ran the experiment again, this time on 98 self-identified Republicans. This was achieved by adding a filter on Prolific so that only Republican-identified individuals could see the task. The rest of the experiment, including payment, was exactly the same, except for an additional disclaimer that participants could not use the Firefox browser for the experiment, added after the first run revealed an incompatibility between JavaScript and Firefox. The two runs of the experiment amounted to a total of 298 participants who completed the task.
Before we can do much of anything with the data, we need to make sure it’s usable! This means filtering out all unimportant or extraneous trials, running exclusion criteria, and adding additional trial and item-level data that will be necessary later in the analysis.
For this analysis, we require the following packages:
library(ggplot2)
library(tidyverse)
library(lme4)
library(stringr)
library(languageR)
library(lmerTest)
library(reshape2)
source("helpers.R")
theme_set(theme_minimal())
First, we need to read in the data, which has been pre-merged from both runs of the experiment. In the same chunk, we also filter out all of the example trials, as well as all data points from outside the critical region.
all_data <- read.csv('merged_all.csv') %>%
filter(trial_id!= 'example') %>%
filter(region=='critical')
Now, we want to exclude any participants who failed to answer at least 80% of the attention check questions correctly. We do this by creating a list of all participants who scored less than 80% on these checks, and then cross-referencing this list with all data points, removing any data points whose participants were in the exclusion list.
exclusion <- all_data %>% group_by(workerid) %>%
summarise(accuracy = mean(response_correct)) %>%
mutate(exclude = ifelse(accuracy < 0.80,'Yes','No')) %>%
filter(exclude == 'Yes')
all_data <- all_data[!(all_data$workerid %in% exclusion$workerid),] %>%
filter(rt !='null')
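To see the by-participant exclusion in action, here is a minimal sketch on a toy data frame. The worker IDs and responses are made up for illustration, and the sketch uses base R rather than the tidyverse pipeline above so that it runs without any extra packages; it is not the exact code run on the real data.

```r
# Toy data: three hypothetical participants with ten attention checks each
# (1 = correct, 0 = incorrect).
toy <- data.frame(
  workerid = rep(c("w1", "w2", "w3"), each = 10),
  response_correct = c(rep(1, 10),              # w1: 100% correct
                       rep(1, 9), 0,            # w2: 90% correct
                       rep(1, 6), rep(0, 4))    # w3: 60% correct
)

# Per-participant accuracy, then the IDs that fall below the 80% threshold.
accuracy <- tapply(toy$response_correct, toy$workerid, mean)
excluded_ids <- names(accuracy)[accuracy < 0.80]

# Drop every row that belongs to an excluded participant.
toy_kept <- toy[!(toy$workerid %in% excluded_ids), ]

print(excluded_ids)
print(unique(toy_kept$workerid))
```

Only the participant scoring below the threshold is dropped, and all of their rows go with them.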
We also want to filter out all trials in which the log reading time for the critical item was more than 2 standard deviations from the mean log reading time on that item across all participants.
all_data <- all_data %>%
  group_by(trial_id) %>%
  mutate(id_mean = mean(log(rt))) %>%
  mutate(exclusion = (log(rt) < mean(log(rt)) - 2*sd(log(rt))) | (log(rt) > mean(log(rt)) + 2*sd(log(rt)))) %>%
  ungroup() %>%
  filter(exclusion==FALSE)
This results in 238 trials being removed from the 5580 we got after the by-participant exclusions. We now have 5342 trials we can use for analysis.
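The by-item trimming can be sketched on toy data as follows. The item labels, reading times, and the `cutoff` value of 2 standard deviations are assumptions made for this illustration; the sketch again uses base R (`ave()`) in place of `group_by()` and `mutate()`.

```r
# Toy reading times (ms) for two hypothetical items; item "a" has one very slow trial.
toy <- data.frame(
  item = rep(c("a", "b"), each = 10),
  rt = c(800, 820, 790, 810, 805, 795, 815, 800, 810, 5000,
         900, 950, 920, 880, 940, 910, 930, 890, 960, 905)
)

cutoff <- 2  # number of standard deviations; an assumption for this sketch

# Work in log space, with the mean and SD computed separately for each item.
toy$log_rt <- log(toy$rt)
item_mean <- ave(toy$log_rt, toy$item, FUN = mean)
item_sd <- ave(toy$log_rt, toy$item, FUN = sd)

# Flag trials further than `cutoff` SDs from their item's mean, then drop them.
toy$outlier <- abs(toy$log_rt - item_mean) > cutoff * item_sd
toy_trimmed <- toy[!toy$outlier, ]

print(toy[toy$outlier, c("item", "rt")])
print(nrow(toy_trimmed))
```

Because the mean and standard deviation are computed per item, a slow trial is judged only against other trials on the same item.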
Now that we have only the rows we want, let’s add some new columns, which will contain important information for each data point. Here, we will be adding:
Ideally, I would’ve added all of these but the first when I actually created the stimuli and logged responses, but I forgot to! Luckily, R allows us to do this post-hoc fairly straightforwardly… which is good, since these features will be critical in our data visualization and analysis.
The question under investigation here is whether or not individuals’ conceptions of gender affect how they process gendered and gender-neutral forms of English personal and professional titles.
In order to examine this, we need to quantify participants’ ideological views! Here we have adopted the 13-item Social Roles Questionnaire put forth in Baber & Tucker (2006). Questions 1-5 correspond to the ‘Gender Transcendent’ subscale, and questions 6-13 correspond to the ‘Gender Linked’ subscale. Each item is scored on a scale of 0-100. So, the first thing we want to do is make two lists of columns which correspond to these two subscales, since the questions are stored individually in the data:
gender_transcendence_cols <- c('subject_information.gender_q1','subject_information.gender_q2','subject_information.gender_q3','subject_information.gender_q4','subject_information.gender_q5')
gender_linked_cols <- c('subject_information.gender_q6','subject_information.gender_q7','subject_information.gender_q8','subject_information.gender_q9','subject_information.gender_q10','subject_information.gender_q11','subject_information.gender_q12','subject_information.gender_q13')
Now we can use the mutate() function on all_data to add two new columns, one for each subscale. We tell R to take the means of the specified columns in [column_names] of all_data for each individual row: rowMeans(all_data[column_names]). We also have to subtract this mean from 100 in the case of the ‘Gender Transcendent’ subscale, since it is inversely scored. Finally, we can create an average total score regardless of subscores, simply by averaging the two subscores we already have.
all_data <- all_data %>%
mutate(gender_trans = 100 - (rowMeans(all_data[gender_transcendence_cols]))) %>%
mutate(gender_link = rowMeans(all_data[gender_linked_cols]))
gender_all = c('gender_trans','gender_link')
all_data <- all_data %>%
mutate(gender_total = rowMeans(all_data[gender_all]))
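As a quick sanity check on the scoring, here is a small worked sketch with two made-up participants and only two items per subscale. The column names `q1` to `q4` are hypothetical stand-ins for the real questionnaire columns.

```r
# Toy questionnaire data: q1-q2 stand in for the reverse-scored subscale,
# q3-q4 for the directly scored one. All values are invented.
toy <- data.frame(
  q1 = c(100, 20), q2 = c(80, 40),
  q3 = c(10, 90),  q4 = c(30, 70)
)
trans_cols <- c("q1", "q2")
link_cols <- c("q3", "q4")

# Reverse-score the first subscale by subtracting its row mean from 100.
toy$gender_trans <- 100 - rowMeans(toy[trans_cols])
toy$gender_link <- rowMeans(toy[link_cols])

# The total is the mean of the two subscale scores.
toy$gender_total <- rowMeans(toy[c("gender_trans", "gender_link")])

print(toy[c("gender_trans", "gender_link", "gender_total")])
```

Note that averaging the two subscale scores weights each subscale equally, even though the subscales contain different numbers of items; averaging over all items directly would give a slightly different total.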
We also want to add whether the trial included a female or male referent (but also, like, destroy the binary!). In order to do this, we’ll just add a trial_gender column that says ‘female’ if the condition was either ‘neutral_female’ or ‘congruent_female’. Otherwise, we want the trial_gender to say ‘male’.
all_data <- all_data %>%
mutate(trial_gender = ifelse(condition=='neutral_female' | condition == 'congruent_female','female','male'))
all_data %>%
select(workerid,rt,condition,trial_id,trial_gender)
Now we want to add whether or not the lexeme’s neutral form is developed by compounding (as in ‘congress-person’) or by the adoption of the male form (as in ‘actor’ being used more for both men and women). In this study, we only have six lexemes of the latter type, so we’ll just tell R to assign those a morph_type value of ‘adoption’ (for ‘male adoption’), and all else will be assigned a value of ‘compound’.
all_data <- all_data%>%
mutate(morph_type = ifelse(lexeme!= 'actor' & lexeme!= 'host' & lexeme !='hunter' & lexeme!= 'villain' & lexeme!= 'heir' & lexeme!= 'hero','compound','adoption'))
all_data %>%
select(rt,lexeme,morph_type)
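The same classification can be written a bit more compactly with `%in%`, which avoids the long chain of `!=` comparisons. This is only an alternative sketch, shown here on a toy data frame in base R; the non-adoption lexemes in the toy data are illustrative.

```r
# The six lexemes whose neutral form is the adopted male form.
adoption_lexemes <- c("actor", "host", "hunter", "villain", "heir", "hero")

# Toy data with a mix of both types.
toy <- data.frame(lexeme = c("actor", "congressperson", "hero", "flight attendant"))

# Anything in the adoption list is 'adoption'; everything else is 'compound'.
toy$morph_type <- ifelse(toy$lexeme %in% adoption_lexemes, "adoption", "compound")

print(toy)
```

Keeping the lexemes in a named vector also makes it easy to reuse the list elsewhere, or to extend it if more items are added to the study.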
Another important factor we want to explore is the length of the critical item! In order to add this, we simply create a new column form_length and tell R to input as that column’s value the length of the string that appears in that row’s form column, which corresponds to the orthographic form of the critical item in that trial. Note that this will include spaces in the count!
all_data <- all_data %>%
mutate(form_length = str_length(form))
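To make the point about spaces concrete, here is a tiny sketch using base R's `nchar()`, which behaves like `str_length()` for plain strings such as these. The example string is just an illustration of a two-word form.

```r
# A two-word form: the space is counted as a character.
form <- "flight attendant"
length_with_spaces <- nchar(form)

# Stripping spaces first gives the letter-only count, should that be preferred.
length_without_spaces <- nchar(gsub(" ", "", form, fixed = TRUE))

print(c(length_with_spaces, length_without_spaces))
```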
simple_model <- lm(log(rt)~form_length, data = all_data)
all_data <- all_data %>%
mutate(resid_rt = resid(simple_model))
summary(simple_model)
Call:
lm(formula = log(rt) ~ form_length, data = all_data)
Residuals:
Min 1Q Median 3Q Max
-0.8807 -0.3063 -0.0720 0.2406 3.2023
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.965847 0.022117 314.958 < 2e-16 ***
form_length 0.011405 0.002271 5.022 5.37e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4307 on 3411 degrees of freedom
Multiple R-squared: 0.00734, Adjusted R-squared: 0.007049
F-statistic: 25.22 on 1 and 3411 DF, p-value: 5.374e-07
The simple linear regression above estimates the effect of orthographic length on reading time. The new resid_rt column stores the residual reading time, that is, the reading time in log space after we control for the effects of orthographic length.
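For readers less familiar with residualization, the following sketch shows what the residuals are on a handful of invented data points: each residual is the observed log reading time minus the value predicted from length alone. The numbers are made up purely for illustration.

```r
# Toy data: invented form lengths and reading times (ms).
toy <- data.frame(
  form_length = c(5, 7, 9, 11, 13, 15),
  rt = c(700, 760, 800, 905, 940, 1010)
)

# Regress log reading time on length, then keep what length cannot explain.
fit <- lm(log(rt) ~ form_length, data = toy)
toy$resid_rt <- resid(fit)

# The residual is simply observed minus fitted log RT.
manual_resid <- log(toy$rt) - fitted(fit)

# By construction, residuals average to zero and are uncorrelated with length.
print(mean(toy$resid_rt))
print(cor(toy$resid_rt, toy$form_length))
```

One caveat worth keeping in mind: if any rows have a missing `rt` or `form_length`, `lm()` drops them by default and `resid()` returns fewer values than there are rows, so the assignment would fail. Fitting with `na.action = na.exclude` pads the residuals with `NA` instead.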
We also want to make sure we have a column which records whether or not the trial was gender-congruent (as in ‘Shelby is a congresswoman’) or gender neutral (as in ‘Shelby is a congressperson’). We add a trial_congruency column, which is valued as ‘congruent’ if that row’s condition is one of the two congruent conditions. Otherwise, it gets valued as ‘neutral’.
all_data <- all_data %>%
mutate(trial_congruency = ifelse(condition=='congruent_male' | condition == 'congruent_female','congruent','neutral'))
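Since each condition label encodes both congruency and referent gender, both columns could also be derived in one step by splitting the label on the underscore. This is a sketch of that alternative, assuming the four labels follow the `congruency_gender` pattern (the label `neutral_male` is inferred from that pattern rather than taken from the code above).

```r
# The four condition labels, assumed to follow the pattern congruency_gender.
conditions <- c("congruent_male", "congruent_female", "neutral_male", "neutral_female")

# Split each label at the underscore into a two-column character matrix.
parts <- do.call(rbind, strsplit(conditions, "_", fixed = TRUE))

toy <- data.frame(
  condition = conditions,
  trial_congruency = parts[, 1],
  trial_gender = parts[, 2],
  stringsAsFactors = FALSE
)

print(toy)
```

The `ifelse()` approach used in this document has the advantage of being explicit about which labels map to which value, so a mistyped condition label would not silently create a new category.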
Finally, we can classify participants by their particular political alignment; we can construe this broadly as “Republicans” vs. “Democrats”, with those who declined to state a preference, or placed themselves in the middle, as “Non-Partisan”.
all_data <- all_data %>%
mutate(poli_party = ifelse(subject_information.party_alignment == 1 | subject_information.party_alignment == 2,'Republican',ifelse(subject_information.party_alignment == 4 | subject_information.party_alignment == 5,'Democrat','Non-Partisan')))
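Here is a small sketch of how the nested `ifelse()` behaves across the possible responses, wrapped in a hypothetical helper function `classify_party()` for illustration. It assumes a 1-5 alignment scale with 3 as the midpoint, which is inferred from the values used in the code above.

```r
# Hypothetical helper mirroring the nested ifelse() logic.
classify_party <- function(alignment) {
  ifelse(alignment == 1 | alignment == 2, "Republican",
         ifelse(alignment == 4 | alignment == 5, "Democrat", "Non-Partisan"))
}

# One value per scale point, plus a missing response.
responses <- c(1, 2, 3, 4, 5, NA)
parties <- classify_party(responses)

print(data.frame(response = responses, poli_party = parties))
```

One thing this makes visible is that a missing (`NA`) response stays `NA` rather than being labelled "Non-Partisan", which is why some of the later plots filter on `!is.na(poli_party)`.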
Now we can start analysing the data by means of data visualization.
inauguration_2021 = c("#5445b1", "#749dae", "#f3c483", "#5c1a33", "#cd3341","#f7dc6a")
ggplot(all_data, aes(x=subject_information.age, y=gender_total, color=poli_party)) +
geom_point(alpha=.5) +
geom_smooth(method = 'lm', size=1.2)
`geom_smooth()` using formula 'y ~ x'
Warning: Removed 20 rows containing non-finite values (stat_smooth).
Warning: Removed 20 rows containing missing values (geom_point).
ggplot(all_data, aes(x=gender_total, y=resid_rt, color=trial_congruency)) +
geom_point(alpha=.5) +
geom_smooth(method = 'lm', size=1.2) +
theme_minimal()
`geom_smooth()` using formula 'y ~ x'
ggplot(all_data, aes(x=subject_information.age, y=resid_rt, color=trial_congruency, linetype=morph_type)) +
geom_point(alpha=.5) +
geom_smooth(method = 'lm', size=1.2)
`geom_smooth()` using formula 'y ~ x'
Warning: Removed 20 rows containing non-finite values (stat_smooth).
Warning: Removed 20 rows containing missing values (geom_point).
agg_speaker_mean_con <- all_data %>%
group_by(condition,workerid) %>%
summarize(MeanRT=mean(resid_rt))
`summarise()` has grouped output by 'condition'. You can override using the `.groups` argument.
all_data %>%
group_by(condition,trial_gender) %>%
summarize(MeanRT = mean(resid_rt), CI.Low = ci.low(resid_rt), CI.High = ci.high(resid_rt)) %>%
mutate(YMin = MeanRT - CI.Low, YMax = MeanRT + CI.High) %>%
ggplot(aes(x=condition,y=MeanRT,color=trial_gender)) +
geom_point(size=3) +
geom_jitter(data = agg_speaker_mean_con, aes(y=MeanRT),alpha=.1,color='darkred') +
geom_errorbar(aes(ymin=YMin,ymax=YMax), width=.25) +
scale_color_manual(values = inauguration_2021) +
theme_minimal()
`summarise()` has grouped output by 'condition'. You can override using the `.groups` argument.
all_data %>%
group_by(condition,trial_gender,trial_congruency,lexeme) %>%
summarize(MeanRT = mean(resid_rt), CI.Low = ci.low(resid_rt), CI.High = ci.high(resid_rt)) %>%
mutate(YMin = MeanRT - CI.Low, YMax = MeanRT + CI.High) %>%
ggplot(aes(x=condition,y=MeanRT,color=trial_gender,shape=trial_congruency)) +
geom_point(size=3) +
geom_errorbar(aes(ymin=YMin,ymax=YMax), width=.25) +
theme(axis.text.x = element_text(angle = 45, vjust = .7, hjust=.7)) +
scale_color_manual(values = inauguration_2021) +
facet_wrap(~lexeme)
`summarise()` has grouped output by 'condition', 'trial_gender', 'trial_congruency'. You can override using the `.groups` argument.
all_data %>%
filter(morph_type == "adoption") %>%
group_by(condition,trial_gender,trial_congruency,lexeme) %>%
summarize(MeanRT = mean(resid_rt), CI.Low = ci.low(resid_rt), CI.High = ci.high(resid_rt)) %>%
mutate(YMin = MeanRT - CI.Low, YMax = MeanRT + CI.High) %>%
ggplot(aes(x=condition,y=MeanRT,color=trial_gender,shape=trial_congruency)) +
geom_point(size=3) +
geom_errorbar(aes(ymin=YMin,ymax=YMax), width=.25) +
theme(axis.text.x = element_text(angle = 45, vjust = .7, hjust=.7)) +
scale_color_manual(values = inauguration_2021) +
facet_wrap(~lexeme)
`summarise()` has grouped output by 'condition', 'trial_gender', 'trial_congruency'. You can override using the `.groups` argument.
all_data %>%
filter(morph_type == "compound") %>%
group_by(condition,trial_gender,trial_congruency,lexeme) %>%
summarize(MeanRT = mean(resid_rt), CI.Low = ci.low(resid_rt), CI.High = ci.high(resid_rt)) %>%
mutate(YMin = MeanRT - CI.Low, YMax = MeanRT + CI.High) %>%
ggplot(aes(x=condition,y=MeanRT,color=trial_gender,shape=trial_congruency)) +
geom_point(size=3) +
geom_errorbar(aes(ymin=YMin,ymax=YMax), width=.25) +
theme(axis.text.x = element_text(angle = 45, vjust = .7, hjust=.7)) +
scale_color_manual(values = inauguration_2021) +
facet_wrap(~lexeme)
`summarise()` has grouped output by 'condition', 'trial_gender', 'trial_congruency'. You can override using the `.groups` argument.
temp <- all_data %>%
group_by(trial_gender) %>%
summarize(MeanRT = mean(rt), CI.Low = ci.low(rt), CI.High = ci.high(rt)) %>%
mutate(YMin = MeanRT - CI.Low, YMax = MeanRT + CI.High)
dodge = position_dodge(.9)
ggplot(data=temp, aes(x=trial_gender,y=MeanRT,fill=trial_gender)) +
geom_bar(stat='identity',position=dodge) +
geom_errorbar(aes(ymin=YMin,ymax=YMax),width=.25,position=dodge) +
theme(legend.position = 'none')
dodge = position_dodge(.9)
all_data %>%
group_by(trial_gender) %>%
summarize(MeanRT = mean(resid_rt)) %>%
ggplot(aes(x=trial_gender,y=MeanRT,fill=trial_gender)) +
geom_bar(stat='identity',position=dodge) +
theme(legend.position = 'none')
dodge = position_dodge(.9)
all_data %>%
group_by(trial_congruency) %>%
summarize(MeanRT = mean(resid_rt)) %>%
ggplot(aes(x=trial_congruency,y=MeanRT,fill=trial_congruency)) +
geom_bar(stat='identity',position=dodge) +
theme(legend.position = 'none')
all_data %>%
group_by(condition,trial_gender,morph_type) %>%
summarize(MeanRT = mean(resid_rt), CI.Low = ci.low(resid_rt), CI.High = ci.high(resid_rt)) %>%
mutate(YMin = MeanRT - CI.Low, YMax = MeanRT + CI.High) %>%
ggplot(aes(x=condition,y=MeanRT,color=trial_gender)) +
geom_point(size=3) +
geom_errorbar(aes(ymin=YMin,ymax=YMax), width=.25) +
scale_color_manual(values = inauguration_2021) +
facet_wrap(~morph_type) +
theme(axis.text.x = element_text(angle=45, vjust = 0.5))
`summarise()` has grouped output by 'condition', 'trial_gender'. You can override using the `.groups` argument.
agg_speaker_trial <- all_data %>%
group_by(condition,workerid) %>%
summarize(MeanRT=mean(resid_rt))
`summarise()` has grouped output by 'condition'. You can override using the `.groups` argument.
all_data %>%
group_by(trial_no) %>%
summarise(MeanRT = mean(rt)) %>%
ggplot(aes(x=trial_no,y=MeanRT)) +
geom_point() +
geom_smooth() +
theme_minimal()
`geom_smooth()` using method = 'loess' and formula 'y ~ x'
test_model <- lm(resid_rt~trial_congruency*morph_type, data=all_data)
summary(test_model)
Call:
lm(formula = resid_rt ~ trial_congruency * morph_type, data = all_data)
Residuals:
Min 1Q Median 3Q Max
-0.8969 -0.3008 -0.0732 0.2419 3.1890
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.04150 0.01917 2.165 0.030431 *
trial_congruencyneutral -0.02822 0.02683 -1.052 0.292930
morph_typecompound -0.09979 0.02279 -4.378 1.23e-05 ***
trial_congruencyneutral:morph_typecompound 0.12246 0.03205 3.821 0.000135 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4285 on 3409 degrees of freedom
Multiple R-squared: 0.01041, Adjusted R-squared: 0.009535
F-statistic: 11.95 on 3 and 3409 DF, p-value: 8.817e-08
test_model2 <- lm(resid_rt~trial_gender, data=all_data)
summary(test_model2)
Call:
lm(formula = resid_rt ~ trial_gender, data = all_data)
Residuals:
Min 1Q Median 3Q Max
-0.8772 -0.3069 -0.0719 0.2413 3.1987
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.003498 0.010417 -0.336 0.737
trial_gendermale 0.007006 0.014743 0.475 0.635
Residual standard error: 0.4307 on 3411 degrees of freedom
Multiple R-squared: 6.619e-05, Adjusted R-squared: -0.000227
F-statistic: 0.2258 on 1 and 3411 DF, p-value: 0.6347
compounds_only <- all_data %>%
filter(morph_type == 'compound')
compounds_only <- compounds_only %>%
mutate(ctrial_congruency = as.numeric(as.factor(trial_congruency))-mean(as.numeric(as.factor(trial_congruency)))) %>%
mutate(ctrial_gender = as.numeric(as.factor(trial_gender))-mean(as.numeric(as.factor(trial_gender)))) %>%
mutate(cgender_link = scale(gender_link)) %>%
mutate(cgender_total = scale(gender_total))
compound_model <- lmer(resid_rt~ctrial_congruency*ctrial_gender*cgender_total + (1|workerid) + (1|lexeme),data = compounds_only)
summary(compound_model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: resid_rt ~ ctrial_congruency * ctrial_gender * cgender_total +
(1 | workerid) + (1 | lexeme)
Data: compounds_only
REML criterion at convergence: 1747.2
Scaled residuals:
Min 1Q Median 3Q Max
-2.9415 -0.6037 -0.1168 0.4786 6.6039
Random effects:
Groups Name Variance Std.Dev.
workerid (Intercept) 0.071787 0.26793
lexeme (Intercept) 0.009879 0.09939
Residual 0.098696 0.31416
Number of obs: 2392, groups: workerid, 175; lexeme, 14
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) -0.01243 0.03402 30.36022 -0.365 0.7174
ctrial_congruency 0.09596 0.01301 2208.36825 7.377 2.27e-13 ***
ctrial_gender 0.02720 0.01302 2209.23896 2.089 0.0368 *
cgender_total 0.05503 0.02126 173.18802 2.589 0.0104 *
ctrial_congruency:ctrial_gender -0.03472 0.02608 2210.39826 -1.331 0.1832
ctrial_congruency:cgender_total 0.01164 0.01300 2206.77070 0.895 0.3707
ctrial_gender:cgender_total -0.01642 0.01305 2209.79268 -1.259 0.2082
ctrial_congruency:ctrial_gender:cgender_total -0.02910 0.02607 2209.09340 -1.116 0.2646
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) ctrl_c ctrl_g cgndr_ ctrl_cngrncy:ct_ ctrl_cngrncy:cg_ ctrl_g:_
ctrl_cngrnc 0.000
ctrial_gndr 0.000 -0.010
cgender_ttl 0.001 0.003 0.000
ctrl_cngrncy:ct_ -0.002 0.001 0.003 -0.001
ctrl_cngrncy:cg_ 0.002 -0.006 -0.002 0.011 0.001
ctrl_gndr:_ 0.000 -0.001 0.004 0.001 0.016 -0.011
ctrl_cn:_:_ 0.000 0.002 0.015 -0.003 0.001 0.004 0.039
adoptions_only <- all_data %>%
filter(morph_type == 'adoption')
adoptions_only <- adoptions_only %>%
mutate(ctrial_congruency = as.numeric(as.factor(trial_congruency))-mean(as.numeric(as.factor(trial_congruency)))) %>%
mutate(ctrial_gender = as.numeric(as.factor(trial_gender))-mean(as.numeric(as.factor(trial_gender)))) %>%
mutate(cgender_link = scale(gender_link)) %>%
mutate(cgender_total = scale(gender_total))
adoptions_model <- lmer(resid_rt~ctrial_congruency*ctrial_gender*cgender_total + (1|workerid) + (1|lexeme) + (1|name),data = adoptions_only)
boundary (singular) fit: see ?isSingular
summary(adoptions_model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: resid_rt ~ ctrial_congruency * ctrial_gender * cgender_total +
(1 | workerid) + (1 | lexeme) + (1 | name)
Data: adoptions_only
REML criterion at convergence: 908.1
Scaled residuals:
Min 1Q Median 3Q Max
-2.5128 -0.6018 -0.1059 0.5109 7.0153
Random effects:
Groups Name Variance Std.Dev.
workerid (Intercept) 0.05710 0.2390
name (Intercept) 0.00000 0.0000
lexeme (Intercept) 0.02727 0.1651
Residual 0.10582 0.3253
Number of obs: 1021, groups: workerid, 175; name, 24; lexeme, 6
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 0.02989 0.07054 5.72225 0.424 0.68722
ctrial_congruency -0.02857 0.02142 886.60879 -1.334 0.18270
ctrial_gender -0.01961 0.02153 892.34158 -0.911 0.36254
cgender_total 0.05990 0.02075 171.22047 2.887 0.00439 **
ctrial_congruency:ctrial_gender -0.01332 0.04317 892.38092 -0.309 0.75775
ctrial_congruency:cgender_total -0.01599 0.02131 877.46721 -0.750 0.45331
ctrial_gender:cgender_total 0.00566 0.02149 885.47768 0.263 0.79233
ctrial_congruency:ctrial_gender:cgender_total 0.02592 0.04296 881.94051 0.603 0.54650
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) ctrl_c ctrl_g cgndr_ ctrl_cngrncy:ct_ ctrl_cngrncy:cg_ ctrl_g:_
ctrl_cngrnc 0.000
ctrial_gndr 0.000 0.026
cgender_ttl -0.002 -0.007 0.004
ctrl_cngrncy:ct_ 0.003 -0.003 0.006 0.002
ctrl_cngrncy:cg_ -0.002 -0.017 0.005 -0.039 0.020
ctrl_gndr:_ 0.001 0.005 0.000 0.000 -0.002 0.022
ctrl_cn:_:_ 0.001 0.019 0.000 0.011 -0.002 -0.002 -0.068
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see ?isSingular
all_data <- all_data %>%
mutate(ctrial_congruency = as.numeric(as.factor(trial_congruency))-mean(as.numeric(as.factor(trial_congruency)))) %>%
mutate(ctrial_gender = as.numeric(as.factor(trial_gender))-mean(as.numeric(as.factor(trial_gender)))) %>%
mutate(cgender_link = scale(gender_link)) %>%
mutate(cgender_total = scale(gender_total)) %>%
mutate(cmorph_type = as.numeric(as.factor(morph_type))-mean(as.numeric(as.factor(morph_type))))
complex_model <- lmer(resid_rt~ctrial_congruency*ctrial_gender*cgender_total + (1|workerid) + (1|lexeme),data = all_data)
summary(complex_model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: resid_rt ~ ctrial_congruency * ctrial_gender * cgender_total +
(1 | workerid) + (1 | lexeme)
Data: all_data
REML criterion at convergence: 2471
Scaled residuals:
Min 1Q Median 3Q Max
-3.6906 -0.6269 -0.1267 0.4992 7.6516
Random effects:
Groups Name Variance Std.Dev.
workerid (Intercept) 0.06669 0.2582
lexeme (Intercept) 0.01412 0.1188
Residual 0.10240 0.3200
Number of obs: 3413, groups: workerid, 175; lexeme, 20
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.959e-04 3.342e-02 4.231e+01 0.006 0.99535
ctrial_congruency 5.729e-02 1.098e-02 3.214e+03 5.218 1.92e-07 ***
ctrial_gender 1.042e-02 1.098e-02 3.215e+03 0.949 0.34267
cgender_total 5.580e-02 2.027e-02 1.730e+02 2.753 0.00654 **
ctrial_congruency:ctrial_gender -2.174e-02 2.198e-02 3.215e+03 -0.989 0.32270
ctrial_congruency:cgender_total 5.421e-03 1.098e-02 3.214e+03 0.494 0.62152
ctrial_gender:cgender_total -1.135e-02 1.100e-02 3.215e+03 -1.032 0.30229
ctrial_congruency:ctrial_gender:cgender_total -8.751e-03 2.199e-02 3.215e+03 -0.398 0.69065
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) ctrl_c ctrl_g cgndr_ ctrl_cngrncy:ct_ ctrl_cngrncy:cg_ ctrl_g:_
ctrl_cngrnc 0.000
ctrial_gndr 0.000 -0.002
cgender_ttl -0.001 0.001 0.000
ctrl_cngrncy:ct_ 0.000 0.000 0.002 0.000
ctrl_cngrncy:cg_ 0.001 -0.001 -0.001 0.001 0.001
ctrl_gndr:_ 0.000 0.000 0.001 0.001 0.003 -0.002
ctrl_cn:_:_ 0.000 0.002 0.003 0.000 -0.001 0.002 0.004
ideology_model <- lmer(resid_rt~ctrial_congruency*ctrial_gender*cgender_link + (1|workerid) + (1|lexeme), data=all_data)
summary(ideology_model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: resid_rt ~ ctrial_congruency * ctrial_gender * cgender_link +
(1 | workerid) + (1 | lexeme)
Data: all_data
REML criterion at convergence: 2469.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.6650 -0.6231 -0.1237 0.4958 7.6201
Random effects:
Groups Name Variance Std.Dev.
workerid (Intercept) 0.06572 0.2564
lexeme (Intercept) 0.01414 0.1189
Residual 0.10244 0.3201
Number of obs: 3413, groups: workerid, 175; lexeme, 20
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.834e-04 3.336e-02 4.188e+01 0.005 0.99564
ctrial_congruency 5.731e-02 1.098e-02 3.215e+03 5.219 1.91e-07 ***
ctrial_gender 1.045e-02 1.098e-02 3.215e+03 0.952 0.34129
cgender_link 6.373e-02 2.008e-02 1.732e+02 3.174 0.00178 **
ctrial_congruency:ctrial_gender -2.168e-02 2.198e-02 3.215e+03 -0.986 0.32403
ctrial_congruency:cgender_link 5.535e-03 1.098e-02 3.215e+03 0.504 0.61433
ctrial_gender:cgender_link -4.886e-03 1.100e-02 3.215e+03 -0.444 0.65691
ctrial_congruency:ctrial_gender:cgender_link -1.102e-03 2.199e-02 3.215e+03 -0.050 0.96004
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) ctrl_c ctrl_g cgndr_ ctrl_cngrncy:ct_ ctrl_cngrncy:cg_ ctrl_g:_
ctrl_cngrnc 0.000
ctrial_gndr 0.000 -0.002
cgender_lnk -0.001 0.001 0.001
ctrl_cngrncy:ct_ 0.000 0.000 0.002 0.000
ctrl_cngrncy:cg_ 0.001 -0.001 0.001 0.002 0.002
ctrl_gndr:_ 0.000 0.001 0.000 0.001 0.004 0.000
ctrl_cn:_:_ 0.000 0.003 0.004 0.000 -0.001 0.003 0.009
summary(gender_model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: resid_rt ~ ctrial_congruency * ctrial_gender * cgender + (1 | workerid) + (1 | lexeme)
Data: all_data
REML criterion at convergence: 2474
Scaled residuals:
Min 1Q Median 3Q Max
-3.6179 -0.6286 -0.1283 0.4998 7.6343
Random effects:
Groups Name Variance Std.Dev.
workerid (Intercept) 0.06980 0.2642
lexeme (Intercept) 0.01415 0.1190
Residual 0.10242 0.3200
Number of obs: 3413, groups: workerid, 175; lexeme, 20
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 2.636e-04 3.371e-02 4.347e+01 0.008 0.994
ctrial_congruency 5.727e-02 1.098e-02 3.214e+03 5.215 1.95e-07 ***
ctrial_gender 1.040e-02 1.098e-02 3.215e+03 0.947 0.344
cgender 1.202e-02 3.832e-02 1.729e+02 0.314 0.754
ctrial_congruency:ctrial_gender -2.166e-02 2.198e-02 3.215e+03 -0.986 0.324
ctrial_congruency:cgender 5.750e-05 2.032e-02 3.214e+03 0.003 0.998
ctrial_gender:cgender -9.633e-03 2.036e-02 3.216e+03 -0.473 0.636
ctrial_congruency:ctrial_gender:cgender 3.230e-02 4.068e-02 3.215e+03 0.794 0.427
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) ctrl_c ctrl_g cgendr ctr_:_ ctrl_c: ctrl_g:
ctrl_cngrnc 0.000
ctrial_gndr 0.000 -0.002
cgender 0.000 -0.001 0.000
ctrl_cngr:_ 0.000 0.000 0.002 0.000
ctrl_cngrn: -0.001 0.002 -0.001 -0.001 -0.002
ctrl_gndr:c 0.000 0.000 0.001 0.001 -0.002 -0.005
ctrl_cng:_: 0.000 -0.001 -0.004 -0.001 0.000 0.004 -0.004
new_toy <- read.csv('maze_task_1-merged.csv') %>%
filter(trial_id!= 'example') %>%
filter(response_correct != 0) %>%
mutate(trial_gender = ifelse(condition=='neutral_female' | condition == 'congruent_female','female','male')) %>%
mutate(trial_congruency = ifelse(condition=='congruent_male' | condition == 'congruent_female','congruent','neutral'))
new_toy %>%
group_by(word_idx,trial_gender) %>%
summarize(MeanRT = mean(rt)) %>%
ggplot(aes(x=word_idx, y=log(MeanRT), color=trial_gender)) +
geom_line() +
geom_point()
`summarise()` has grouped output by 'word_idx'. You can override using the `.groups` argument.
new_toy %>%
group_by(word_idx,trial_gender) %>%
summarize(MeanRT = mean(rt)) %>%
ggplot(aes(x=word_idx, y=MeanRT, color=trial_gender)) +
geom_line() +
geom_point()
`summarise()` has grouped output by 'word_idx'. You can override using the `.groups` argument.
all_data %>%
filter(!is.na(poli_party)) %>%
group_by(poli_party) %>%
summarize(MeanGenderTotal = mean(gender_total)) %>%
ggplot(aes(x=poli_party, y=MeanGenderTotal)) +
geom_bar(stat = 'identity') +
theme_minimal() +
labs(y="Mean Gender Total", x="Political Party")
all_data %>%
filter(!is.na(subject_information.gender)) %>%
filter(subject_information.gender != '') %>%
group_by(subject_information.gender) %>%
summarize(MeanGenderTotal = mean(gender_total)) %>%
ggplot(aes(x=subject_information.gender, y=MeanGenderTotal)) +
geom_bar(stat = 'identity') +
theme_minimal() +
labs(y="Mean Gender Total", x="Participant Gender")
Rscript merge_results.R maze_task_1-merged.csv dems maze_task_2-merged.csv reps
all_data %>%
group_by(condition,trial_gender) %>%
filter(trial_no < 11) %>%
summarize(MeanRT = mean(resid_rt), CI.Low = ci.low(resid_rt), CI.High = ci.high(resid_rt)) %>%
mutate(YMin = MeanRT - CI.Low, YMax = MeanRT + CI.High) %>%
ggplot(aes(x=condition,y=MeanRT,color=trial_gender)) +
geom_point(size=3) +
geom_jitter(data = agg_speaker_mean_con, aes(y=MeanRT),alpha=.1,color='darkred') +
geom_errorbar(aes(ymin=YMin,ymax=YMax), width=.25) +
scale_color_manual(values = inauguration_2021) +
theme_minimal()
`summarise()` has grouped output by 'condition'. You can override using the `.groups` argument.
all_data %>%
filter(lexeme == "flight attendant") %>%
filter(!is.na(poli_party)) %>%
group_by(condition,trial_gender,trial_congruency,poli_party) %>%
summarize(MeanRT = mean(resid_rt), CI.Low = ci.low(resid_rt), CI.High = ci.high(resid_rt)) %>%
mutate(YMin = MeanRT - CI.Low, YMax = MeanRT + CI.High) %>%
ggplot(aes(x=condition,y=MeanRT,color=trial_gender,shape=trial_congruency)) +
geom_point(size=3) +
geom_errorbar(aes(ymin=YMin,ymax=YMax), width=.25) +
theme(axis.text.x = element_text(angle = 45, vjust = .7, hjust=.7)) +
scale_color_manual(values = inauguration_2021) +
facet_wrap(~poli_party)
`summarise()` has grouped output by 'condition', 'trial_gender', 'trial_congruency'. You can override using the `.groups` argument.
all_data %>%
filter(lexeme == "flight attendant") %>%
filter(!is.na(subject_information.gender)) %>%
group_by(condition,trial_gender,trial_congruency,subject_information.gender) %>%
summarize(MeanRT = mean(resid_rt), CI.Low = ci.low(resid_rt), CI.High = ci.high(resid_rt)) %>%
mutate(YMin = MeanRT - CI.Low, YMax = MeanRT + CI.High) %>%
ggplot(aes(x=condition,y=MeanRT,color=trial_gender,shape=trial_congruency)) +
geom_point(size=3) +
geom_errorbar(aes(ymin=YMin,ymax=YMax), width=.25) +
theme(axis.text.x = element_text(angle = 45, vjust = .7, hjust=.7)) +
scale_color_manual(values = inauguration_2021) +
facet_wrap(~subject_information.gender)
`summarise()` has grouped output by 'condition', 'trial_gender', 'trial_congruency'. You can override using the `.groups` argument.
¹ Part of this experiment and analysis was also carried out as part of my class project in Stanford’s LINGUIST 245B ‘Methods in Psycholinguistics’ class, taught by Judith Degen.
² This amounts to an hourly rate of $20.73. We originally anticipated that participants would take an average of 7 minutes to complete the experiment, and set the base pay at $15 an hour.